neuro-symbolic translational precision medicine platform POC

team snet

2021-03-29

This report summarizes work done in the summer and fall of 2020 to prototype the integration of advanced machine learning techniques with OpenCog symbolic inference for applied biomedical research. To reproduce this output, run the R notebook from the reports directory of the cancer repo.

data sources and normalization

curatedBreastData R Bioconductor compilation of microarray data sets

describe cbd package

describe coincide study subset

describe normalization for meta-analysis

spectral bigraphs

Spectral bigraphs [spectral graph ref] are a blah blah blah…

In the raw combined data we can easily see the study source bias. In particular, [GSE9893 & GSE20194 blah blah]. Note the two-channel studies marked with crosses in the lower right and top center of the principal component plots (name 3 studies? where is third in first plot?)
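A minimal sketch of how study-source bias can show up in a principal component plot. It uses simulated data, not the curatedBreastData matrices; the object names `expr` and `study` and the size of the simulated offsets are assumptions made only for illustration.

```r
# Simulated stand-in for a combined genes x samples expression matrix.
# `expr` and `study` are hypothetical names, not objects from the notebook.
set.seed(1)
n_genes <- 200
study <- factor(rep(c("A", "B", "C"), times = c(30, 30, 30)))

# Each simulated study gets its own per-gene offset to mimic a batch effect.
offsets <- matrix(rnorm(n_genes * nlevels(study), sd = 2), nrow = n_genes)
expr <- offsets[, as.integer(study)] +
  matrix(rnorm(n_genes * length(study)), nrow = n_genes)

# prcomp expects samples in rows, so transpose the matrix.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Color samples by study; separate clouds per study suggest source bias.
plot(pca$x[, 1], pca$x[, 2], col = as.integer(study), pch = 19,
     xlab = "PC1", ylab = "PC2")
legend("topright", legend = levels(study),
       col = seq_len(nlevels(study)), pch = 19)
```

If samples separate by study rather than by biology in the first components, some form of batch correction is likely needed before pooling.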

batch mean centering
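As a rough illustration of batch mean centering, the sketch below subtracts each gene's within-study mean from every sample in that study. The function name `batch_mean_center` and the simulated inputs are hypothetical; this is not necessarily the exact procedure used in the notebook.

```r
# Batch mean centering: for every batch, subtract each gene's mean
# computed over the samples of that batch.
# `expr` is a genes x samples matrix, `batch` has one label per sample.
batch_mean_center <- function(expr, batch) {
  batch <- as.factor(batch)
  stopifnot(ncol(expr) == length(batch))
  centered <- expr
  for (b in levels(batch)) {
    idx <- which(batch == b)
    block <- expr[, idx, drop = FALSE]
    # rowMeans has one value per gene and is recycled down each column.
    centered[, idx] <- block - rowMeans(block)
  }
  centered
}

# Simulated example: study "B" is shifted by +3 relative to study "A".
set.seed(2)
study <- rep(c("A", "B"), each = 5)
expr <- matrix(rnorm(4 * 10), nrow = 4) + rep(c(0, 3), each = 4 * 5)
centered <- batch_mean_center(expr, study)
```

After centering, every gene has mean zero within each study, which removes additive study offsets but leaves scale differences between studies untouched.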

combat empirical bayes normalization

InfoGAN dimension reduction & clustering

describe infogan & include image from google doc

comparison with PAM50 clustering

survival curves
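One way to draw survival curves per cluster is a Kaplan-Meier fit with the `survival` package. The sketch below uses simulated follow-up times; the cluster labels, hazard rates, and column names are assumptions for illustration and do not come from the study data.

```r
library(survival)

# Simulated cohort: three hypothetical clusters with different hazards.
set.seed(3)
n <- 120
cluster <- factor(sample(c("c1", "c2", "c3"), n, replace = TRUE))
rate <- c(c1 = 0.05, c2 = 0.10, c3 = 0.20)[as.character(cluster)]
event_time <- rexp(n, rate = rate)
censor_time <- runif(n, min = 0, max = 30)

d <- data.frame(
  time = pmin(event_time, censor_time),          # observed follow-up
  status = as.integer(event_time <= censor_time), # 1 = event, 0 = censored
  cluster = cluster
)

# Kaplan-Meier estimate per cluster and a log-rank test across clusters.
fit <- survfit(Surv(time, status) ~ cluster, data = d)
lr <- survdiff(Surv(time, status) ~ cluster, data = d)

plot(fit, col = 1:3, xlab = "Time", ylab = "Survival probability")
legend("topright", legend = levels(d$cluster), col = 1:3, lty = 1)
```

The log-rank statistic in `lr$chisq` gives a first check on whether the curves differ across clusters.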

GSE20194

This study blah blah blah

logistic regression comparison

GSE9893

The outlier dataset examines relapse outcome under adjuvant tamoxifen treatment (PMID: 18347175; Chanrion M, Negre V, et al. A gene expression signature that can predict the recurrence of tamoxifen-treated primary breast cancer. Clin Cancer Res. 2008 Mar 15;14(6):1744-52. doi: 10.1158/1078-0432.CCR-07-1833; PMCID: PMC2912334).

logistic regression comparison

RFS ~ 1 + p5 + age + node + radio + tumor + grade + pr

vs

RFS ~ 1 + pam + age + node + radio + tumor + grade + pr
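The two model formulas can be fitted and compared with base R `glm`. The sketch below runs on a simulated data frame, so the numbers carry no meaning. Reading `p5` as a five-level cluster label and `pam` as a PAM50 subtype is an assumption, as are the codings of the clinical covariates.

```r
# Simulated data frame with the column names used in the formulas.
# All values are random; the variable codings are assumed for illustration.
set.seed(4)
n <- 150
d <- data.frame(
  p5    = factor(sample(paste0("k", 1:5), n, replace = TRUE)),
  pam   = factor(sample(c("LumA", "LumB", "Her2", "Basal", "Normal"),
                        n, replace = TRUE)),
  age   = round(rnorm(n, mean = 60, sd = 10)),
  node  = rbinom(n, 1, 0.4),
  radio = rbinom(n, 1, 0.5),
  tumor = runif(n, min = 0.5, max = 5),
  grade = factor(sample(1:3, n, replace = TRUE)),
  pr    = rbinom(n, 1, 0.6)
)
d$RFS <- rbinom(n, 1, plogis(-0.5 + 0.8 * d$node - 0.5 * d$pr))

# Fit both logistic regressions on the same samples.
fit_p5 <- glm(RFS ~ 1 + p5 + age + node + radio + tumor + grade + pr,
              family = binomial, data = d)
fit_pam <- glm(RFS ~ 1 + pam + age + node + radio + tumor + grade + pr,
               family = binomial, data = d)

# The models are not nested, so compare them by AIC rather than by a
# likelihood-ratio test.
comparison <- AIC(fit_p5, fit_pam)
print(comparison)
```

A lower AIC favors the corresponding grouping variable, given the same covariates and the same set of samples.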